Three and one-half years of research on computerized
ability testing are summarized. The original objectives of the
research were: (1) to develop and implement the stratified
computer-based ability test; (2) to compare, on psychometric
criteria, the various approaches to computer-based ability testing,
including the stratified computerized test, the pyramidal approach,
Lord's flexilevel test, two-stage testing, and some mathematical
models for computerized testing; (3) to determine the effect on
ability test scores of maintaining test items at a level of
difficulty near the individual's estimated ability level by means of
computer-controlled administration under one or more of the stated
strategies; (4) to determine the effect of providing feedback of
correctness of response to each item within the standard models for
computerized testing by using a special variation of the stratified
approach designed to insure various proportions of correct
responses; and (5) to determine the utility for diagnostic purposes
of information on an individual's item response latencies. The
research approach is summarized and related to the eighteen
technical reports produced under this contract. Twenty-one major
research findings are presented. The implications of the research
findings and methods for future research in computerized adaptive
testing are described. Also included are abstracts of the eighteen
technical reports derived from this research. (Author/DEP)

FINAL REPORT:

COMPUTERIZED ABILITY TESTING, 1972-1975



Objectives 

The original objectives of the research were:

1. To develop and implement the stratified computer-based ability
test.

2. To compare, on psychometric criteria, the various approaches
to computer-based ability testing, including:

a. The stratified computerized test

b. The pyramidal approach

c. Lord's flexilevel test

d. Two-stage testing

e. Some of the mathematical models for computerized testing.

3. To determine the effect on ability test scores of:

a. Maintaining test items at a level of difficulty near the
individual's estimated ability level by means of
computer-controlled administration under one or more of the above
strategies.

b. Providing "feedback" of correctness of response to each
item within the standard models for computerized testing
by using a special variation of the stratified approach
designed to insure various proportions of correct responses.

4. To determine the utility for diagnostic purposes of information
on an individual's item response latencies.

Research in pursuance of these objectives began in March 1972 
and continued through September 15, 1975. The research led to the 
publication of sixteen Technical Reports, with two more currently in 
preparation. Abstracts of all eighteen Technical Reports follow this 
overview of the research program. 



Approach 




Research began with a comprehensive review of the literature on 
adaptive or tailored testing (Research Report 73-1). This review 
identified four major research approaches to problems of adaptive 
testing. These included empirical (live-testing) studies, Monte Carlo
computer simulation studies, "real-data" simulation studies, and theoretical
studies. These research approaches were evaluated, and it was concluded
that live-testing studies and Monte Carlo simulation studies provided
the most useful kinds of research information. This review of the
literature also led to the conclusion that very little was known about the
various strategies of adaptive testing. The little evidence that was
available suggested that adaptive tests provided better measurement
than conventional tests under a variety of circumstances. But it was
obvious that considerable research was necessary in order to discover
appropriate measurement applications, approaches, and theory relevant
to adaptive testing.

The review of the literature identified a number of different
strategies of adapting ability tests to individuals. These strategies
were described in Research Report 74-5. Each strategy was evaluated
in terms of its potential for providing equiprecise measurement
(measurements with equal precision at all levels of ability), and in
terms of its feasibility under both paper-and-pencil and computer
administration.

In order to provide an item pool for live-testing studies, a pool
of 575 multiple-choice word knowledge items was calibrated (Research
Report 74-2). From this pool, 369 items were used in all subsequent
live-testing studies implemented in the research program. Word knowledge
items were selected because of their general use in "intelligence"
tests and their appearance in almost all major multiple-aptitude batteries.
Development of the item pool and its refinement led to an analysis of
its dimensionality, and a general purpose computer program developed
to assist in that analysis was described in Research Report 75-2.

The development of the stratified adaptive (stradaptive) computerized
test was reported in detail in Research Report 73-3. Subsequent
live-testing research with this adaptive testing strategy (Research Report
75-4) and computer simulation studies (Research Report 75-6) were
implemented in order to evaluate its characteristics and feasibility.
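
The branching logic of a stratified adaptive test can be sketched as
follows. This is a minimal illustration, assuming that items are grouped
into difficulty strata and that the testee moves up one stratum after a
correct answer and down one after an incorrect answer; the function name,
the data layout, and the simple stopping rule are hypothetical and are not
the procedure documented in Research Report 73-3.

```python
def stradaptive_path(strata, entry, answers):
    """Return the (stratum index, item) pairs administered.

    strata  -- list of item lists, ordered from easiest to hardest stratum
    entry   -- index of the stratum where testing starts (the entry point)
    answers -- function(item) -> True if the testee answers correctly
    Testing stops when the stratum called for has no unused items left,
    which is a simplification of a flexible termination criterion.
    """
    next_unused = [0] * len(strata)
    current = entry
    path = []
    while next_unused[current] < len(strata[current]):
        item = strata[current][next_unused[current]]
        next_unused[current] += 1
        path.append((current, item))
        step = 1 if answers(item) else -1
        # Branch up after a correct answer and down after an incorrect one,
        # staying within the available strata.
        current = min(max(current + step, 0), len(strata) - 1)
    return path

# Hypothetical pool: three strata of two items each.
pool = [["e1", "e2"], ["m1", "m2"], ["h1", "h2"]]
# Hypothetical testee who fails only the hardest items.
print(stradaptive_path(pool, 1, lambda item: not item.startswith("h")))
```

Because every pass through the loop uses up one item, the sketch always
terminates, even for a testee who answers every item correctly.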

The combination of live-testing and computer simulation studies 
was continued in the investigation of the psychometric characteristics 
of other strategies of adaptive testing. Live-testing research with the 
two-stage adaptive testing strategy was reported in Research Report 
73-4, and computer simulation studies replicating and extending those
findings are in Research Report 74-4. Research Report 75-3 presents
the results of both live-testing and computer simulation studies of the
flexilevel strategy. Computer simulation data using a Bayesian adaptive
testing strategy, motivated by findings of live-testing research, are
in Research Report 76-1. Data from live testing with the pyramidal
adaptive testing strategy are reported in Research Report 74-3.
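
For the two-stage strategy, the routing step can be illustrated with a
short sketch. It assumes that a number-correct score on a routing test is
compared with fixed cutoffs to select one of several measurement tests; the
function name, the cutoff values, and the tie convention are hypothetical.

```python
import bisect

def route(routing_score, cutoffs):
    """Index of the measurement test for a routing-test number-correct score.

    cutoffs -- ascending score thresholds; a score equal to a cutoff is
               sent to the higher test (an assumed convention).
    """
    return bisect.bisect_right(cutoffs, routing_score)

# Hypothetical cutoffs for three measurement tests (easy, medium, hard).
cutoffs = [4, 7]
print(route(3, cutoffs), route(4, cutoffs), route(9, cutoffs))
```

A testee whose routing score falls one point short of a cutoff because of
measurement error is sent to the easier measurement test, which is the kind
of misclassification a routing test can produce.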

One study (Research Report 75-1) presents empirical data comparing 
two adaptive testing strategies. Although the original plans included 
a substantial number of these inter-strategy comparisons, it became
evident that it is difficult to draw clear conclusions concerning the
comparison of two or more strategies of testing from live-testing
studies. Consequently, later attempts to compare the relative
effectiveness of different strategies of adaptive and/or conventional testing
utilized computer simulation studies. Research Report 75-5 presents
the results of a computer simulation study comparing the psychometric
characteristics of a number of adaptive testing strategies and a discussion
of some of the problems involved in live-testing studies.
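
A Monte Carlo simulation of this kind rests on generating item responses
for a simulated testee of known ability. The sketch below shows one way to
do so, assuming a three-parameter logistic response model; the model
choice, the function names, and the item parameters are assumptions for
illustration rather than the simulation design used in the Research Reports.

```python
import math
import random

def p_correct(theta, a, b, c):
    """Probability of a correct response (assumed three-parameter logistic).

    a = discrimination, b = difficulty, c = chance (guessing) level.
    """
    return c + (1.0 - c) / (1.0 + math.exp(-a * (theta - b)))

def simulate_responses(theta, items, rng):
    """Draw one simulated response vector; items is a list of (a, b, c)."""
    return [rng.random() < p_correct(theta, a, b, c) for a, b, c in items]

rng = random.Random(0)  # fixed seed so that a run can be repeated
items = [(1.0, b, 0.2) for b in (-1.0, 0.0, 1.0)]  # hypothetical items
print(simulate_responses(0.5, items, rng))
```

Repeating the draw for many simulated testees at each ability level gives
the data from which information curves, bias, and similar criteria can be
estimated.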

Live-testing research on adaptive testing strategies began using
cathode-ray tube terminals (CRTs) acoustically coupled to a
time-shared computer system (Research Report 74-1). Because the
characteristics of this computer system did not permit research in furtherance
of Objectives 3 and 4, in 1974 the research program began using a
minicomputer. This system allowed accurate measurements of testee
response latencies and an environment permitting the study of the
psychological effects of computerized testing.

Studies of the effects of "feedback," or knowledge of results,
on ability test scores and testees' psychological reactions are
reported in Research Reports 76-3 and 76-4. A first analysis of testee
item response latency data is reported in Research Report 76-2.

Major Findings 

The major findings summarized below are generally organized 
according to the original objectives of the research program. Additional 
details are in the Research Report abstracts. Many of the original 
Research Reports contain additional important findings concerning 
specific adaptive testing strategies or methodological aspects of 
research in adaptive testing. 

1. Implementation of the stratified adaptive (stradaptive) 
computerized test shows that it is a feasible approach to 
computerized adaptive testing (Research Report 73-3). 
Evaluation of the stradaptive test in comparison with other 
strategies of adaptive testing (Research Report 74-5) shows 
that it has considerable logical appeal as a result of its 
use of differential entry points, a flexible termination 
criterion which can take account of guessing, and efficient 
use of real item pools. The stradaptive test also provides
scores which reflect the consistency with which a testee 
interacts with an item pool. 

2. Simulation research comparing the stradaptive test with other
strategies of adaptive ability measurement, under one item
pool configuration, shows that its information curve is
the flattest of the strategies studied (Vale, in Research
Report 75-5). Thus, of the adaptive testing strategies
studied, the stradaptive test appears to provide the best
realization of the ideal of measurement with equal and high
precision at all trait levels.

3. Research comparing the stradaptive test with non-adaptive
approaches to ability testing (Research Report 75-6) shows
that the stradaptive test provides more equiprecise measurement
than a peaked conventional test. As item discriminations
increased, the stradaptive test provided a greater advantage
in terms of equiprecision. While a rectangularly distributed
conventional test also provides equiprecise measurement, the
level of precision is substantially lower than that of an
otherwise comparable stradaptive test (Vale, in Research
Report 75-5).

4. Live-testing research with the stradaptive test (Research
Report 75-4) shows that its consistency scores, which appear
to reflect the dimensionality of the interaction of an
individual with a given item pool, show promise of being
good moderator variables for the prediction of test-retest
stability. This research showed very high test-retest
stabilities for a highly consistent group of individuals,
in comparison to only moderate test-retest stabilities for
those individuals whose consistency scores on first testing
were lower.

5. Rational comparison of adaptive testing strategies (Research
Report 74-5) suggested that some of the approaches proposed
for adaptive testing (e.g., the Robbins-Monro procedure)
are infeasible with real item pools. Bayesian and maximum
likelihood approaches to adaptive testing appeared to be the
most promising, followed by the stradaptive test and the
pyramidal models. This evaluation also suggested that the
flexilevel test had the least logical appeal of the adaptive
testing models proposed.

6. In a computer simulation study (Vale, in Research Report
75-5), all of the adaptive testing strategies provided more
equiprecise measurement than did a peaked conventional test.
All of the adaptive strategies provided higher levels of
average information than did a rectangular conventional
test.

7. The computer simulation study (Vale, in Research Report
75-5) also provided comparative information on the relative
equiprecision of adaptive testing strategies. Within the
adaptive testing strategies, the rankings of the strategies
based on the obtained information curves were about the same
as those based on the previous logical evaluation of the
strategies. Thus, the stradaptive and Bayesian strategies
yielded the most desirable measurement characteristics and
the flexilevel test provided the least desirable characteristics.

8. Computer simulation and live-testing studies (Research Report
75-3) indicated that the flexilevel test offered little
improvement in measurement characteristics over a conventional
peaked test. In addition, the flexilevel test was evaluated
as having the potential to raise negative psychological effects
as a result of its branching strategy (Research Report 74-5).






9. In live-testing studies comparing the test-retest stabilities
of adaptive testing strategies and peaked conventional tests 
(Research Reports 73-4, 74-3, 75-3, 75-4) when tests were 
equated for number of items and memory effects, the adaptive 
tests generally had higher test-retest stabilities than did 
the conventional test. There were no major differences 
between the test-retest stabilities resulting from the 
different adaptive testing strategies. 

10. While a Bayesian adaptive testing strategy was logically 
evaluated as a promising testing strategy (Research Report 
74-5) and yielded information curves in one study which had
desirable characteristics (Vale, in Research Report 75-5),
research using different criteria of evaluation showed that
this testing strategy has some problems which reduce its
utility (McBride, in Research Report 75-5; Research Report
76-1). Both live-testing and computer simulation studies
showed that the Bayesian adaptive testing strategy studied
yielded scores which were highly correlated with test length.
In addition, ability estimates derived from this strategy 
were biased for two-thirds of the typical ability range. 
This testing strategy also yields scores which are dependent 
upon the characteristics of the prior ability estimate required 
by the testing strategy. 

11. A major problem in the implementation of two-stage adaptive
tests is that of misclassification due to errors of measurement
in the routing test (Research Reports 73-4, 74-4). But a
well-designed two-stage test can provide information curves
which are flatter, hence yielding more equiprecise measurement,
than a peaked conventional test (Research Report 74-4; Vale,
in Research Report 75-5).

12. A simple pyramidal adaptive testing strategy (Research Reports
74-3, 75-1) with a fixed step size is a promising approach
to adaptive testing. Its major problem is that it results in
information curves which are low for ability levels divergent
from the mean (Vale, in Research Report 75-5). But it appears
to provide a wider range of adequate measurement than conventional
tests or the two-stage or flexilevel adaptive testing models.
In terms of providing equiprecise measurement, however, its
results are not as good as those of the stradaptive test or
the Bayesian adaptive test.

13. Implementation of adaptive testing requires the use of scoring
methods other than simple number-correct scores, such as
average difficulty of correct responses, or difficulty of
last item answered. Live-testing research with several
alternative scoring methods (Research Reports 74-3, 75-4)
shows that they provide scores with different characteristics
in terms of test-retest stabilities, distributional
characteristics, and correlations with other variables. Computer
simulation research (McBride, in Research Report 75-5)
shows that scoring methods derived from latent trait
theory differ in terms of bias, regression on ability, and
precision. Further research is needed on the development of
optimal scoring methods for adaptive testing.

14. The evaluation of competing strategies of testing, including
competing strategies of adaptive testing, is quite difficult
(Sympson, in Research Report 75-5). Live-testing studies
are complicated by memory effects, lack of adequate criteria,
and non-comparability of tests due to differing test lengths
and differing item discriminations. Live-testing studies
comparing adaptive and conventional tests may also be complicated
by psychological effects (Research Reports 76-3 and 76-4).
Computer simulation studies can provide evaluations on a
variety of criteria, but it is necessary that the computer
simulation model be shown to adequately reflect the behavior
of real testees (Research Reports 74-4, 75-3, 75-6).
Theoretical or analytic studies are extremely limited as a means
for evaluating adaptive testing. Because of the restrictive
assumptions necessary to implement theoretical studies, they
can provide only limited conclusions which may not generalize
to more realistic conditions. It appears that the best
approach to evaluating competing testing strategies is a
systematic combination of live-testing studies and extensive
computer simulations.


15. An analysis of response latency data shows that testees
approach different testing procedures in different ways
(Research Report 76-2). The response latency data suggest
that these different test-taking styles and strategies might
be potentially useful as moderator or predictor variables
in the prediction of external criteria.

16. Computer-administered feedback (immediate knowledge of results)
on a conventional test appears to result in enhanced ability
test performance for testees of all ability levels (Research
Report 76-3). Under computer-administered feedback conditions,
mean test scores were significantly higher for both high-
and low-ability testees. Ninety percent of college student
testees favorably evaluated their experience with
computer-administered feedback (Research Report 76-4).

17. Adaptive tests appear to be intrinsically more motivating for
low-ability testees (Research Report 76-4), and result in
higher ability estimates (Research Report 76-3), than similarly
administered conventional tests. This suggests that adaptive
testing might eliminate some of the undesirable psychological
effects characteristic of conventional testing procedures,
resulting in fairer and more accurate test scores for testees
who typically obtain low scores on conventional ability tests.




18. In a computer-administered conventional test, the provision
of immediate feedback (immediate knowledge of results) to
minority group members appears to raise their ability scores
to the levels of members of the non-minority group (Betz,
in Research Report 75-5). This effect appears to be a
motivational effect, similar to the motivational effect found
for adaptive tests administered to other testees who typically
obtain lower average scores on conventional ability tests
(Research Reports 76-3 and 76-4).

19. Computerized adaptive testing results in tests of fewer
items, to achieve the same or higher degrees of accuracy,
than conventional tests. Consequently, with total testing
time fixed, the use of computerized adaptive testing results
in extra time available during testing which can be used in
the measurement of other abilities or in obtaining measurements
of higher accuracy. More precise measurement, or data available
from the measurement of other abilities, combined with additional
data available from computerized testing (e.g., response
consistency, Research Report 75-4; response latency, Research
Report 76-2), might result in increased validity in the
prediction of external criteria.

20. The implementation of adaptive testing on large-scale
time-shared computer systems can be hazardous. Experience in
testing thousands of students on a system of this type shows
that it is too unpredictable to provide a reliable means
of adaptive testing. In addition to frequent failures due
to hardware, software or communications problems, computer
response time was generally too long and too variable. These
characteristics combined to raise the possibility of negative
psychological effects among testees who were being tested
on the system. Experience with a dedicated minicomputer
system in this research program indicates that this approach
to adaptive testing is considerably more desirable than the
use of a large-scale time-shared computer system, with little
increase in cost.

21. A preliminary cost analysis, based on the administration of
adaptive tests by minicomputer, suggests that computerized
adaptive testing will be a financially feasible approach to
ability testing. Given the continuing and projected decreases
in costs of computer equipment, it is expected that the costs
of adaptive testing will approximate those of paper-and-pencil
test administration and scoring within a few years.
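
Finding 10 notes that scores from the Bayesian strategy depend on the prior
ability estimate. The sketch below illustrates that dependence with a
simple grid approximation to the posterior mean. It is not the Bayesian
procedure studied in the Research Reports; the normal prior, the
two-parameter logistic model, the function name, and the item parameters
are all assumptions made for this example.

```python
import math

def posterior_mean(responses, items, prior_mean, prior_sd):
    """Posterior mean ability on a grid (normal prior, assumed 2PL items).

    responses -- list of booleans; items -- list of (a, b) pairs.
    """
    grid = [-4.0 + 0.05 * k for k in range(161)]  # ability values -4 to 4
    weights = []
    for theta in grid:
        # Unnormalized prior density at this ability value.
        w = math.exp(-0.5 * ((theta - prior_mean) / prior_sd) ** 2)
        # Multiply by the likelihood of each observed response.
        for correct, (a, b) in zip(responses, items):
            p = 1.0 / (1.0 + math.exp(-a * (theta - b)))
            w *= p if correct else 1.0 - p
        weights.append(w)
    total = sum(weights)
    return sum(t * w for t, w in zip(grid, weights)) / total

# Same hypothetical two-item response record, two different priors.
record, pool = [True, False], [(1.0, 0.0), (1.0, 0.0)]
print(posterior_mean(record, pool, 1.0, 1.0),
      posterior_mean(record, pool, -1.0, 1.0))
```

With so short a test the two estimates differ noticeably, because the same
responses are combined with different priors; the difference shrinks as
more items are administered.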

Implications for Further Research

The findings and experience of this three-and-one-half-year research
program support the feasibility, utility and psychometric advantages of
computerized adaptive ability testing. However, many new questions were
raised by the research, and some of the original questions addressed
are still in need of further research. Portions of the research
described below are being pursued under a contract entitled
"Computer-Based Adaptive Measurement of Intellectual Capabilities," NR 150-382,
with the Personnel and Training Research Programs of the Office of
Naval Research.

Branching strategies. While the research program has answered some
preliminary questions concerning the relative utility of various branching
strategies for use in adaptive testing, considerably more research is
needed. The evaluations done thus far have used both live-testing and
computer simulation studies. The results of live-testing studies were
confounded by memory effects, differing item discriminations, and by
potential psychological effects.

Studies designed to equate testing strategies for both memory
effects and differing item discriminations are currently in progress.
Some of these studies involve less dependence on test-retest stability
as an appropriate criterion for the comparative evaluation of adaptive
testing strategies in live-testing studies. Instead, they use
parallel-forms reliabilities as an appropriate evaluative criterion. Studies
using test-retest stability are carefully designed to equate testing
strategies for number of items administered, item discriminations,
and memory effects.

While the computer simulation studies have resulted in useful
information concerning the relative performance characteristics of a
number of adaptive testing strategies, the results to date are limited
in their generality. This derives from the fact that most of the computer
simulation studies have relied heavily on information curves as an
evaluative criterion. However, in the later phases of the research
program it became obvious that other criteria, such as bias and the
nature of the regression of ability estimates on true ability, are
appropriate characteristics of adaptive testing strategies to be studied.
Also, the computer simulations directly comparing adaptive testing
strategies which have been implemented involved a restricted set of
assumptions. It will be necessary to evaluate various adaptive testing
strategies in terms of a variety of evaluative criteria, under a variety
of conditions, with a carefully constructed set of item pool
configurations, and at different test lengths. Such studies are currently in
progress.
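
Two of the criteria named above, bias and the regression of ability
estimates on true ability, can be computed from simulation output in a few
lines. The sketch below is illustrative only; the function name and the
example values are hypothetical, and it summarizes bias over the whole
sample rather than separately at each ability level.

```python
def bias_and_slope(true_abilities, estimates):
    """Mean signed error, and least-squares slope of estimates on true ability."""
    n = len(true_abilities)
    mean_t = sum(true_abilities) / n
    mean_e = sum(estimates) / n
    bias = mean_e - mean_t
    sxx = sum((t - mean_t) ** 2 for t in true_abilities)
    sxy = sum((t - mean_t) * (e - mean_e)
              for t, e in zip(true_abilities, estimates))
    return bias, sxy / sxx

# Hypothetical estimates that are pulled toward the mean.
print(bias_and_slope([-1.0, 0.0, 1.0], [-0.5, 0.0, 0.5]))
```

A slope below 1.0 with zero overall bias, as in this example, indicates
estimates that are too high for low-ability testees and too low for
high-ability testees. Applying the same function within narrow ability
intervals would show bias as a function of ability level.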


A further extension of the research effort would include evaluation
of some adaptive testing strategies not studied to date. Specifically,
the maximum likelihood adaptive testing strategies have not been studied
in the present research, in comparison with other strategies. Furthermore,
certain variations of some of the present adaptive testing strategies
are also in need of research. For example, further research is necessary
to develop termination criteria for use in the stradaptive and in the
Bayesian testing strategies. Research is also necessary to develop
hybrid adaptive testing strategies which combine the desirable features
of several of the approaches studied to date, and to study their
psychometric characteristics.

Scoring methods. Research has shown that different scoring methods
lead to ability estimates with different characteristics. Thus, the
development of optimal scoring methods for adaptive testing is another
important area for future research. Research on scoring methods should
include the evaluation of the resulting ability estimates on a variety
of criteria. In addition, the characteristics of these scoring methods
should be studied in a variety of item pool configurations using a
number of adaptive testing strategies.
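
Two of the scoring methods named in the findings, the average difficulty of
correct responses and the difficulty of the last item answered, can be
sketched as follows. The record format (pairs of item difficulty and
correctness), the function names, and the example values are assumptions
made for illustration.

```python
def average_difficulty_correct(record):
    """Mean difficulty of items answered correctly, or None if none were."""
    correct = [b for b, ok in record if ok]
    return sum(correct) / len(correct) if correct else None

def difficulty_last_item(record):
    """Difficulty of the last item administered."""
    return record[-1][0]

# Hypothetical response record: (item difficulty, answered correctly?).
record = [(0.0, True), (0.5, True), (1.0, False), (0.5, True)]
print(average_difficulty_correct(record), difficulty_last_item(record))
```

Even on this short record the two methods give different scores, which is
the kind of difference among scoring methods that needs to be evaluated
against several criteria.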

In addition to evaluating the relative performance characteristics
of a variety of scoring methods designed for dichotomous response testing,
research should also proceed in the utilization of different modes of
response in adaptive testing. In contrast to paper-and-pencil testing,
computerized testing permits an immediate evaluation of the admissibility
of each item response made by a testee. Consequently, it is possible
to implement both graded and continuous methods of responding to
ability tests within the framework of computerized adaptive testing.
The use of these methods will result in new branching models and scoring
methods, and considerable research will be necessary to determine the
utility and adequacy of the non-dichotomous response modes possible in
computerized ability testing. Furthermore, the use of these methods
of responding to ability tests, as well as free-response items, might
result in a reduction of guessing behavior, which was found to seriously
affect the performance characteristics of some adaptive testing
strategies. Similarly, research with non-dichotomous methods of response
might lead to the development of methods to determine when a testee
is guessing in response to a given test item.

Dimensionality. The finding that individuals differ in their
response consistency in adaptive tests, and that response consistency
appears to be a moderator of test-retest stability, implies that
additional research needs to be done in the area of the dimensionality
of individual response records. If high consistency in a stradaptive
test is interpreted as unidimensionality of the response record, this
suggests that low consistency reflects non-unidimensionality. Consequently,
it should be possible to develop measures of an individual's response
dimensionality during the process of adaptive testing. When the computer
recognizes that an individual is responding in a multidimensional
fashion, as opposed to the usually assumed unidimensional model, it
could be programmed to implement multidimensional adaptive branching
strategies for that individual. This, of course, implies that it will
be necessary to develop and refine multidimensional adaptive testing
strategies to correct for intra-individual variations in dimensionality.
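
One way such a measure might be computed during testing is sketched below.
It assumes, purely for illustration, that consistency is indexed by the
spread of the difficulties of the items a testee encounters; the function
name and this choice of index are hypothetical and are not taken from the
consistency scores defined in Research Report 75-4.

```python
import statistics

def consistency_index(difficulties_encountered):
    """Population SD of administered item difficulties (lower = more consistent)."""
    return statistics.pstdev(difficulties_encountered)

# Hypothetical records: one testee settles quickly, the other ranges widely.
steady = [0.0, 0.5, 0.0, 0.5, 0.0, 0.5]
erratic = [0.0, 0.5, 1.0, 0.5, -0.5, -1.0]
print(consistency_index(steady), consistency_index(erratic))
```

An index of this kind could be updated after every item, so that a program
could react to low consistency while the test is still in progress.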

The fact that many abilities are correlated to varying degrees
suggests that the optimal implementation of computerized adaptive
testing would take into account the intercorrelations among ability
dimensions. Consequently, research is needed on the development of branching
models which account for inter-variable multidimensionality. One
approach to this problem which will be pursued during the next phases
of this research program is the development of branching techniques to
locate an individual's position in a continuous multidimensional ability
space.

Psychological effects. Studies of the psychological effects of
adaptive testing and immediate knowledge of results suggest that
both are important variables for further research. A fertile area for
investigation is that of the difficulty of a test which will result in
optimal levels of motivation and performance for individuals. Also
important is the role of immediate knowledge of results in ability
testing. One central question is whether feedback should be
systematized as computer-administered feedback or should occur as
self-administered feedback based on each testee's ability to infer the
correctness of each item response. Thus, this line of research should
result in the development of an adaptive test which administers to an
individual the items that will result in optimal motivation and performance,
as a consequence of the proportion of positive feedback that the individual
obtains from the testing experience.
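
Controlling the proportion of positive feedback amounts to choosing item
difficulties that give a testee a target probability of success. A minimal
sketch follows, assuming a two-parameter logistic response model with no
guessing; the model, the function name, and the example values are
assumptions for illustration.

```python
import math

def difficulty_for_target(theta, target_p, a=1.0):
    """Item difficulty at which a testee of ability theta succeeds with
    probability target_p, under an assumed 2PL model with discrimination a."""
    return theta - math.log(target_p / (1.0 - target_p)) / a

# Hypothetical testee of ability 0.0, aiming for 70 percent correct.
print(difficulty_for_target(0.0, 0.7))
```

Target proportions above one-half call for items easier than the testee's
ability level, and proportions below one-half call for harder items.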

Observations during the testing of thousands of subjects in this 
research program indicate that the characteristics of the computer 
system and the terminal displays may have differential psychological 
effects on testees. Consequently, it is important that research be done
on the nature of the computer terminal which will provide least
interference with a testee's ability-testing behavior. This should include
study of the display speed of the terminal and the visual characteristics 
of the terminal display. 

The fact that large-scale time-shared computer systems tend to have
unpredictable response times implies the use of smaller scale, preferably
real-time, computer systems for the implementation of adaptive testing.
However, a good minicomputer system can result in extremely fast response
times (less than one-quarter second) following a testee's answer to a
given test question. Thus, an important question for research is whether
such very fast response times will result in a testee feeling
unnecessarily "paced", with resulting increases in detrimental test anxiety
and decreases in test-taking motivation. Research is necessary to 
determine the optimal computer response time, for the average testee and 
for specific testees. Similarly, the display speed of the terminal 
is another important factor which may differentially affect testee 
behavior. Display speeds of 10 characters per second are much too 
slow for the average testee, and display speeds of 960 characters per 
second may be too fast. Research should be designed to study the 
effects of different display speeds on testee behavior, holding constant 
other characteristics of the testing environment. 
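
The practical meaning of these display speeds can be seen with a little
arithmetic. The sketch below assumes a hypothetical test item of 300
characters; only the two display speeds come from the text.

```python
def display_seconds(n_characters, chars_per_second):
    """Time needed to write n_characters to the terminal at a given speed."""
    return n_characters / chars_per_second

ITEM_LENGTH = 300  # hypothetical item length in characters
for speed in (10, 960):
    print(speed, display_seconds(ITEM_LENGTH, speed))
```

For an item of this assumed length, the slower speed takes half a minute to
present the item, while the faster speed presents it in about a third of a
second.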

New tests. The majority of research to date in computerized adaptive
testing has been within the framework of inter-item branching models,
using multiple-choice tests similar to those used in conventional
paper-and-pencil testing. However, these approaches do not fully utilize
the capabilities of computer systems. Thus, research is necessary that
will develop computerized ability tests which take into account the
unique capabilities of interactive computer equipment. Rather than
using simple item-to-item branching techniques, this research would
reconceptualize ability testing into an interactive problem-solving environment.
The testee would be presented with a problem, and he/she would interact
with the computer to attempt to solve the problem. The development of
such new methods of testing will require new methods for evaluating
an individual's interaction with the computer during the solution of
a specified problem. These modes of interaction would include the
individual's method of solving the problem, an evaluation of the quality
of the final solution, and measurements of the speed with which the
final solution was reached. Clearly, considerable new ground will need
to be broken in this area of interactive testing.

Research is also needed on new forms of item presentation. This
includes test items which are not in the typical multiple-choice format,
and items which are pictorial in nature. For both kinds of items,
testees should be able to respond in non-verbal ways where possible, or
in natural language. Clearly, both computer hardware and software
developments, as well as a generalization of psychometric theory,
will be required to implement some of these new modes of testing.

Similarly, the capability of measuring abilities which are now
poorly measured on paper-and-pencil tests would be a fertile area
for further research in computerized adaptive testing. These include
the measurement of memory abilities, the measurement of movement-based
abilities, and the measurement of decision-making abilities. The
inclusion of such tests in a computerized adaptive testing battery,
possibly combined with existing tests based on a dimensional
conceptualization of human abilities, could result in substantially greater
validity in the prediction of practical criteria than is now possible
with conventional paper-and-pencil ability tests.



Research Report 73-1
Ability Measurement: Conventional or Adaptive?
David J. Weiss and Nancy E. Betz
February 1973

Research to date on adaptive (sequential, branched, individualized,
tailored, programmed, or response-contingent) ability testing is reviewed
and summarized, following a brief review of problems inherent in
conventional individual and group approaches to ability measurement. Research
reviewed includes empirical, simulation, and theoretical studies of
adaptive testing strategies. Adaptive strategies identified in the
literature include two-stage and multistage tests. Multistage tests
are differentiated into fixed-branching models and variable-branching
models (including Bayesian and non-Bayesian strategies). Results of
research using the various strategies and research approaches are compared
and summarized. These studies lead to the general conclusion that,
under a number of circumstances, adaptive testing can considerably
reduce testing time and at the same time yield scores of higher reliability
and validity than conventional tests. A number of new psychometric
problems raised by adaptive testing are discussed, as is the criterion
problem in evaluating the utility of adaptive testing. Problems of
implementing adaptive testing with paper and pencil or with special
testing machines are reviewed; the potential advantages of computer-
controlled adaptive test administration are described. (AD 757788)



Research Report 73-3*
The Stratified Adaptive Computerized Ability Test
David J. Weiss
September 1973

The stratified adaptive (stradaptive) test is described as a strategy
for tailoring an ability test to individual differences in testee
ability levels. Stradaptive test administration is controlled by a
time-shared computer system. The rationale of the method is described,
which derives from Binet's strategy of ability test administration and
findings concerning peaked tests from modern test theory. The essential
elements of stradaptive testing considered include the differential
entry point, branching rules, and individualized termination criteria.
Different methods of scoring the stradaptive test are discussed, as
are the implications of individual differences in consistency of test
responses within the stradaptive test record. Examples of the results
of live stradaptive testing are presented and discussed. Implications
of additional data derived from stradaptive test response records are
considered and related to other psychometric concepts. (AD 768376)
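
The three elements named in this abstract (entry point, branching rule, termination) can be illustrated with a short sketch. It is not the program described in the report: the function name `stradaptive_path`, the three-stratum pool, the rule of moving exactly one stratum up after a correct response and one down after an incorrect response, and the fixed-length stopping rule are all assumptions made for illustration. The report itself discusses individualized termination criteria, which are not modeled here.

```python
from typing import Callable, List, Tuple


def stradaptive_path(
    strata: List[List[str]],
    entry_stratum: int,
    answer_is_correct: Callable[[str], bool],
    max_items: int,
) -> List[Tuple[int, str, bool]]:
    """Administer items by moving up or down one stratum after each response.

    strata[0] holds the easiest items and strata[-1] the hardest (assumed layout).
    Returns a list of (stratum index, item id, correct?) tuples.
    """
    next_unused = [0] * len(strata)  # index of the next unused item in each stratum
    current = entry_stratum          # differential entry point for this testee
    record: List[Tuple[int, str, bool]] = []
    while len(record) < max_items:
        if next_unused[current] >= len(strata[current]):
            break                    # stratum exhausted: stop (simplified termination)
        item = strata[current][next_unused[current]]
        next_unused[current] += 1
        correct = answer_is_correct(item)
        record.append((current, item, correct))
        # Harder stratum after a correct answer, easier one after an incorrect
        # answer, clamped to the available range of strata.
        step = 1 if correct else -1
        current = min(max(current + step, 0), len(strata) - 1)
    return record


# Hypothetical pool and a simulated testee who fails only the hardest items.
pool = [["e1", "e2", "e3"], ["m1", "m2", "m3"], ["h1", "h2", "h3"]]
path = stradaptive_path(
    pool,
    entry_stratum=1,
    answer_is_correct=lambda item: not item.startswith("h"),
    max_items=5,
)
print(path)
```

In this example the simulated testee alternates between the middle and hardest strata, which is the kind of response record from which a ceiling stratum could be identified.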


*Research Report 73-2 was not supported by this contract.



Research Report 73-4
An Empirical Study of Computer-Administered Two-Stage Ability Testing
Nancy E. Betz and David J. Weiss
October 1973

A two-stage adaptive test and a conventional peaked test were constructed
and administered on a time-shared computer system to students in
undergraduate psychology courses. Comparison of the score distributions
showed that the two-stage test scores were somewhat more variable than
the conventional test scores. The comparison also showed that the
distribution of the two-stage scores was normal, whereas that of the
conventional test scores tended toward flatness. The two-stage test
had higher test-retest stability than the conventional test when the
effect of memory was considered. The relationship between the two-stage
and conventional test scores was relatively high and primarily
linear, but it left about 20% of the reliable variance in the
conventional test scores unaccounted for. Further analyses of the two-stage
test showed that the difficulty levels of the measurement tests were
not optimal, and that 4 to 5% of the testees were misclassified into
measurement tests. The relatively poor internal consistency of the
measurement tests in comparison to those of the routing and conventional
tests was apparently due to the extreme homogeneity of ability within
the measurement test sub-groups. The findings of the study were
interpreted as favorable to continued exploration of two-stage testing
procedures. Suggestions for improving the characteristics of the
two-stage testing strategy are offered. (AD 768993)



Research Report 74-1
A Computer Software System for Adaptive Ability Measurement
Louis J. DeWitt and David J. Weiss
January 1974

A system of computer programs designed to control the administration
of adaptive ability tests was developed and used for over 2500 hours of
ability measurement. The system is capable of administering any combination
of two testing strategies to a given individual without interruption.
Each test can be based on one of six different testing strategies and can
administer items selected from up to nine different item pools within
each strategy. The system is designed to accept either multiple-choice
responses or free-response numeric responses. For each test and testee,
the administrator can choose whether to give no feedback, feedback after
each item, or item-by-item feedback upon completion of testing. The
requirements of the research design and constraints of the computer system
as well as practical considerations are detailed, and their role in
the design of the system is discussed. Technical requirements of the
software system and problems that might arise in a transfer of the
software system to another computer system are considered. Some basic
concepts of computer programming are developed as an aid to the reader
not technically trained in computer concepts. (AD 773961)



Research Report 74-2
A Word Knowledge Item Pool for Adaptive Ability Measurement
James R. McBride and David J. Weiss
June 1974

A series of four vocabulary norming tests was used to develop a large
homogeneous pool of vocabulary test items for use in computer-administered
adaptive testing research. A total of 575 unique vocabulary knowledge
items was divided among four norming tests and administered to separate
groups of college undergraduates. Norming tests were administered by
computer or paper-and-pencil in fixed and random order. Analyses showed
no effects due to item order or mode of administration. Item difficulty
and discrimination indices of both the classical test model and the
normal ogive item model were derived on the norming data. On the basis of
item analysis results, 369 items were selected as satisfactory for the
adaptive testing item pool. Factor-analytic studies of subsets of the 369
items confirmed the assumption of unidimensionality of the selected item
pool. On the basis of known technical limitations in the research 
and the unique problems of developing item pools for adaptive testing, 
an outline was developed for the design of future norming studies 
specifically intended to develop large homogeneous test-item pools 
for use in computer-administered adaptive ability measurement. (AD 781894) 



Research Report 74-3
An Empirical Investigation of Computer-Administered
Pyramidal Ability Testing
Kevin C. Larkin and David J. Weiss
July 1974

Three pyramidal adaptive tests and a conventional peaked test were
constructed, and were administered by time-shared computer to two separate
groups of undergraduate psychology students. Six different methods of
scoring pyramidal tests were evaluated, with respect to score distributions,
stability, the relationship among scoring methods, and the relationship
between pyramidal scoring methods and scores on the conventional test.
For both the pyramidal tests and the conventional test, score distributions
were platykurtic and positively skewed. Two methods of scoring the
pyramidal tests consistently used an equal or greater proportion of
the range of possible scores than the conventional test. The 15-stage
pyramidal tests showed test-retest correlations only slightly lower
than those for the 40-item conventional test. However, when the effects
of memory were considered, the pyramidal strategy yielded more stable
ability estimates than conventional tests of equivalent length. The
correlation between pyramidal and conventional test scores ranged from
.82 to .86 depending on the scoring method used. One pair of scoring
methods was found to be perfectly correlated for properly constructed
pyramidal tests; a second pair correlated almost perfectly. Findings
were generally in favor of pyramidal testing, but further investigation
of this adaptive testing strategy is necessary to determine its other
important psychometric characteristics and to develop optimal rules
for constructing pyramidal item structures. (AD 783553)



Research Report 74-4
Simulation Studies of Two-Stage Ability Testing
Nancy E. Betz and David J. Weiss
October 1974

Monte carlo simulation procedures were used to study the psychometric
characteristics of two two-stage adaptive tests and a conventional
"peaked" ability test. Results showed that scores yielded by both
two-stage tests better reflected the normal distribution of underlying
ability. Ability estimates from one of the two-stage tests were more
reliable and had a slightly higher relationship to underlying ability than
did the conventional test scores. One of the two-stage tests yielded
an approximately horizontal information function, indicating more
constant precision of measurement for individuals at all ability levels.
The conventional test and the second two-stage test yielded information
functions peaked at the mean ability level but dropping off at more
extreme levels of ability; however, the second two-stage test provided
more information than the conventional test at all levels of ability.
The findings of the study were interpreted as indicating the potential
superiority of two-stage tests in comparison to conventional tests.
Several improvements in the construction of two-stage tests are suggested
for further research. (AD A001230)
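
The difference between a peaked and a flatter information function can be illustrated numerically. The sketch below assumes a two-parameter logistic item model, for which the information contributed by an item is a²P(1 - P); the abstract does not state the model used in the simulations, and the item parameters and function names here are hypothetical.

```python
import math
from typing import List, Tuple


def prob_correct(theta: float, a: float, b: float) -> float:
    """Two-parameter logistic probability of a correct response (assumed model)."""
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))


def total_information(theta: float, items: List[Tuple[float, float]]) -> float:
    """Sum of the item informations a^2 * P * (1 - P) over (a, b) pairs."""
    total = 0.0
    for a, b in items:
        p = prob_correct(theta, a, b)
        total += a * a * p * (1.0 - p)
    return total


# Hypothetical 9-item tests: one peaked at difficulty 0, one with
# difficulties spread evenly from -2 to +2. All discriminations are 1.0.
peaked = [(1.0, 0.0)] * 9
spread = [(1.0, -2.0 + 0.5 * k) for k in range(9)]

for theta in (-2.0, 0.0, 2.0):
    print(theta,
          round(total_information(theta, peaked), 3),
          round(total_information(theta, spread), 3))
```

With these invented parameters the peaked test is more informative at the center of the ability scale, while the spread test loses less information at the extremes, which is the pattern the abstract describes in terms of peaked versus approximately horizontal information functions.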



Research Report 74-5 
Strategies of Adaptive Ability Measurement 
David J. Weiss 
December 1974 

A number of strategies are described for adapting ability test items
to individual differences in ability levels of testees. Each strategy
consists of a different set of rules for selecting the sequence of test
items to be administered to a given testee. Advantages and disadvantages
of each strategy are discussed, and research issues unique to the strategy 
are described. Strategies reviewed are differentiated into two-stage 
and multistage approaches. Several variations of the two-stage approach 
are described. Multistage strategies include fixed-branching and variable- 
branching strategies. Fixed-branching strategies reviewed include a 
number of variations of the pyramidal approach (e.g., constant step 
size pyramids, decreasing step size pyramids, truncated pyramids, 
multiple-item pyramids), the flexilevel test, and the stradaptive test. 
Variable-branching approaches include two Bayesian strategies and two 
maximum likelihood strategies. The various strategies are compared with 
each other on important characteristics and on practical considerations,
and are ranked on their apparent potential for providing equally precise
measurement at all ability levels. (AD A004270)



Research Report 75-1
An Empirical Comparison of Two-Stage and
Pyramidal Adaptive Ability Testing
Kevin C. Larkin and David J. Weiss
February 1975

A 15-stage pyramidal test and a 40-item two-stage test were constructed
and administered by computer to 111 college undergraduates. The two-stage
test was found to utilize a smaller proportion of its potential score
range than the pyramidal test. Score distributions for both tests
were positively skewed but not significantly different from the normal
distribution. The pyramidal test's score distributions tended to be
platykurtic while the two-stage test's distribution tended to be leptokurtic.
The assignment of subjects to measurement subtests in the two-stage
test was more accurate than in a previous empirical investigation,
since the misclassification rate was less than 1%. Comparison of
scoring methods for the pyramidal strategy supported earlier findings
that the average difficulty scoring methods were most useful. The
correlations between scores on the two adaptive strategies ranged from
r=.79 to .84. Both adaptive strategies appeared to adapt item
difficulties to individual ability differences so as to reduce chance
effects due to guessing. The pyramidal strategy seemed to be slightly
more successful in eliminating guessing than the two-stage strategy.
Results are discussed with respect to internal consistency reliabilities,
stabilities, and the relation of each strategy to conventional testing.
Simulation studies are suggested to further delineate the optimum
characteristics of each testing strategy. (AD A006733)



Research Report 75-2
TETREST: A FORTRAN IV Program for
Calculating Tetrachoric Correlations
James R. McBride and David J. Weiss
February 1975

A general purpose computer program for the calculation of a matrix of
tetrachoric correlations is described. This program was developed for
use in adaptive (and other) testing research to examine the
unidimensionality assumption in latent trait theory, in conjunction with available
factor analysis programs. Several other potential applications and
details for its use are described. The program accepts as input raw
dichotomous data, reduced joint frequency data, or joint and marginal
proportions, for up to 75 items. Output options include the tetrachoric
correlation matrix, the matrix of phi coefficients, fourfold frequency
tables for every item pair, a joint frequency matrix (which reduces
all the information in the fourfold tables to a square matrix with order
equal to the number of items), and a pair-by-pair listing of input
proportions and output correlations that permits testing the program
against published tables of the tetrachoric correlation. Variable input
and output formatting makes the program convenient to use in conjunction
with other analyses by packaged statistical programs. Examples of input
and output are presented. A complete FORTRAN IV listing is included.
(AD A007572)
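
Two of the quantities named in this abstract can be illustrated from a single fourfold table. The sketch below is written in Python rather than FORTRAN IV and is not the TETREST algorithm, which the abstract does not specify. The phi coefficient is computed exactly; the tetrachoric correlation is only approximated, using the classical cosine-pi formula. The function names and the example table are hypothetical.

```python
import math


def phi_coefficient(a: int, b: int, c: int, d: int) -> float:
    """Phi for the fourfold table [[a, b], [c, d]]; a and d are the agreeing cells."""
    denom = math.sqrt((a + b) * (c + d) * (a + c) * (b + d))
    return (a * d - b * c) / denom


def tetrachoric_cos_pi(a: int, b: int, c: int, d: int) -> float:
    """Cosine-pi approximation: r = cos(pi / (1 + sqrt(ad / bc))).

    This is an approximation only, and it is undefined when a cell is zero,
    so that case is rejected here.
    """
    if min(a, b, c, d) <= 0:
        raise ValueError("all four cells must be positive for this approximation")
    odds_ratio = (a * d) / (b * c)
    return math.cos(math.pi / (1.0 + math.sqrt(odds_ratio)))


# Hypothetical item pair: 40 testees pass both items, 40 fail both,
# and 10 fall in each of the two disagreeing cells.
table = (40, 10, 10, 40)
print(phi_coefficient(*table), tetrachoric_cos_pi(*table))
```

For this table phi is 0.6, while the approximate tetrachoric value is about 0.81. The gap shows why the two coefficients can lead to different factor-analytic results for the same dichotomous data.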



Research Report 75-3
Empirical and Simulation Studies of Flexilevel Ability Testing
Nancy E. Betz and David J. Weiss
July 1975

A 40-item flexilevel test and a 40-item conventional test were compared,
using data obtained through 1) computer-administration of the two tests to
three groups of college students, and 2) monte carlo simulation of test
response patterns. Results indicated that the flexilevel score distribution
better reflected the underlying normal distribution of ability, and that
the flexilevel test had a higher parallel-forms reliability and a higher
relationship to underlying ability level than did the conventional test.
The overall test-retest stability of the two tests was equivalent, but
there was evidence indicating that memory effects inflated the stability
of the flexilevel test scores less than that of conventional test scores.
The flexilevel test provided more accurate measurement at almost all
ability levels, although its information function was similar in shape
to that of the conventional test. However, the interpretation of differences
in the level of information provided was confounded by differences in
the average discriminating power of the items in the two tests. The
flexilevel test also appeared to reduce random guessing behavior in
comparison to the conventional test. (AD A013185)



Research Report 75-4
A Study of Computer-Administered Stradaptive Ability Testing
C. David Vale and David J. Weiss
October 1975

A conventional vocabulary test and two forms of a stradaptive vocabulary
test were administered by a time-shared computer system to undergraduate
college students. The two stradaptive tests differed in that one counted
question-mark responses (i.e., omitted items) as incorrect and the other
ignored items responded to with question marks. Stradaptive test scores
were more consistent with the hypothesized nature of the population
distribution of verbal ability. When corrected for differing levels of
item discrimination and memory effects, the test-retest stabilities of
the two testing strategies were about equal. Scores on one form of the
stradaptive test were found to be very stable for testees with highly
consistent response records on initial testing. Stability of "subject
characteristic curve" data was high, suggesting the usefulness of these
data for describing test-testee interactions. Of the ten stradaptive
ability scores studied, which grouped into four clusters, average
difficulty scores had the highest stabilities. Analysis of difficulties
of items associated with correct, incorrect, and question-mark responses
suggested that items with question mark responses should not be ignored,
but should be treated as incorrect responses in branching decisions.
Suggestions for future research on the stradaptive testing model are
made. (AD A018758)



Research Report 75-5
Computerized Adaptive Trait Measurement: Problems and Prospects
Nancy E. Betz, James R. McBride, James B. Sympson and C. David Vale
with contributions by R. Darrell Bock and Robert Linn
Edited by David J. Weiss
November 1975

This report presents the proceedings of a symposium presented at the Annual
Convention of the American Psychological Association, August 30, 1975.
The symposium consisted of four papers and the comments of two discussants.

1. C. David Vale. Problem: Strategies of Branching through an
   Item Pool.
   This paper describes a variety of strategies for adapting tests
   to the trait level of each individual on the basis of the testee's
   responses to previously administered items. Based on data from
   computer simulations, the various strategies are compared in
   terms of the levels and shapes of information curves they provide
   under one particular set of conditions. Limitations of the
   data presented are discussed.

2. James R. McBride. Problem: Scoring Adaptive Tests.
   Several approaches to scoring adaptive tests are described.
   Inapplicability of traditional number-correct scores in adaptive
   testing, where different individuals answer different items,
   is discussed. The essentials of latent trait theory are
   summarized, and two scoring methods usable with that approach are
   explicated. These scoring methods (maximum likelihood scoring and
   Bayesian scoring) are compared using simulation data, on
   criteria of information, bias, and regression on ability.
   Limitations of these scoring methods are discussed.

3. James B. Sympson. Problem: Evaluating the Results of Adaptive
   Testing.
   Six component elements of a testing procedure are described;
   it is suggested that proper evaluation of a testing procedure
   should be based on consideration of these elements as separable
   components. Classes of criteria for evaluating a testing
   procedure are differentiated into validating criteria, theoretical
   criteria, psycho-social criteria, and cost criteria. Within
   each of these categories, the various criteria are discussed
   and contrasted. Suggestions are made for the appropriate
   applications of each of these criteria. The problem of using
   multiple criteria is briefly discussed, and it is suggested
   that live-testing and simulation research be systematically
   combined. A number of specific recommendations are made concerning
   problems of evaluating the results of adaptive testing.

4. Nancy E. Betz. Prospects: New Types of Information and
   Psychological Implications.
   Several types of new information available from computerized
   adaptive measurement are described. These include individualized
   error of measurement, response consistency, improved response
   modes, response latencies, and new kinds of tests. Data from
   live computerized testing are presented showing that response
   consistency moderates test-retest reliability. The potential
   psychological advantages of computerized testing are discussed.
   Data are presented from two studies demonstrating the facilitating
   effect of immediate knowledge of results after each test item on
   ability test performance.

Comments by the discussants, Robert L. Linn of the University of Illinois
and R. Darrell Bock of the University of Chicago, include a discussion
of some of the limitations of the research presented, some differing
interpretations, and suggestions for future research in adaptive testing.
(AD A018675)
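
The point made in the second paper, that number-correct scores are not comparable when different individuals answer different items, can be illustrated with a maximum likelihood ability estimate. The sketch below is not the procedure used in the symposium paper. It assumes a two-parameter normal ogive model with known item parameters and no guessing, and it locates the maximum by a simple grid search. All names and item parameters are hypothetical.

```python
import math
from typing import List, Tuple

Response = Tuple[float, float, int]  # (discrimination a, difficulty b, 1 if correct else 0)


def normal_ogive(theta: float, a: float, b: float) -> float:
    """P(correct) under a two-parameter normal ogive model (no guessing assumed)."""
    return 0.5 * (1.0 + math.erf(a * (theta - b) / math.sqrt(2.0)))


def log_likelihood(theta: float, responses: List[Response]) -> float:
    """Log likelihood of a scored response record at ability level theta."""
    total = 0.0
    for a, b, u in responses:
        p = min(max(normal_ogive(theta, a, b), 1e-12), 1.0 - 1e-12)  # guard log(0)
        total += math.log(p) if u == 1 else math.log(1.0 - p)
    return total


def ml_ability(responses: List[Response], lo: float = -4.0, hi: float = 4.0,
               steps: int = 801) -> float:
    """Grid-search maximum likelihood estimate of ability on [lo, hi]."""
    grid = [lo + (hi - lo) * k / (steps - 1) for k in range(steps)]
    return max(grid, key=lambda t: log_likelihood(t, responses))


# Two hypothetical testees, each with 3 of 4 items correct, but on
# items of very different difficulty.
easy_record = [(1.0, -2.0, 1), (1.0, -1.5, 1), (1.0, -1.0, 1), (1.0, -0.5, 0)]
hard_record = [(1.0, 0.5, 1), (1.0, 1.0, 1), (1.0, 1.5, 1), (1.0, 2.0, 0)]

easy_estimate = ml_ability(easy_record)
hard_estimate = ml_ability(hard_record)
print(easy_estimate, hard_estimate)
```

Both records have the same number-correct score, yet the second testee receives a clearly higher ability estimate because the items answered were harder. A grid search is used only to keep the sketch short; an iterative method would normally be preferred, and the estimate is not finite for records that are all correct or all incorrect.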



Research Report 75-6
A Simulation Study of Stradaptive Ability Testing
C. David Vale and David J. Weiss
December 1975

A conventional test and two forms of a stradaptive test were administered
to thousands of simulated subjects by minicomputer. Characteristics of the
three tests using several scoring techniques were investigated while
varying the discriminating power of the items, the lengths of the tests,
and the availability of prior information about the testee's ability
level. The tests were evaluated in terms of their correlations with
underlying ability, the amount of information they provided about ability,
and the equiprecision of measurement they exhibited. Major findings
were 1) scores on the conventional test correlated progressively less
with ability as item discriminating power was increased beyond a=1.0;
2) the conventional test provided increasingly poorer equiprecision of
measurement as items became more discriminating; 3) these undesirable
characteristics were not characteristic of scores on the stradaptive test;
4) the stradaptive test provided higher score-ability correlations than the
conventional test when item discriminations were high; 5) the stradaptive
test provided more information and better equiprecision of measurement
than the conventional test when test lengths and item discriminations
were the same for the two strategies; 6) the use of valid prior ability
estimates by stradaptive strategies resulted in scores which had better
measurement characteristics than scores derived from a fixed entry point;
7) a Bayesian scoring technique implemented within the stradaptive testing
strategy provided scores with good measurement characteristics; and 8)
further research is necessary to develop improved flexible termination
criteria for the stradaptive test. (AD A020961)



Research Report 76-1
Some Properties of a Bayesian Adaptive Ability Testing Strategy
James R. McBride and David J. Weiss
March 1976

Four monte carlo simulation studies were conducted of Owen's Bayesian
sequential procedure for adaptive ability testing. Whereas previous
simulation studies of this procedure have concentrated on evaluating it
in terms of the correlation of its test scores with simulated ability in a
normal population, these four studies explored a number of additional
properties, both in a normally distributed population and in a
distribution-free context. Study 1 replicated previous studies with finite item pools,
but examined such properties as the bias of estimate, mean absolute error,
and correlation of test length with ability. Studies 2 and 3 examined the
same variables in a number of hypothetical infinite item pools, investigating
the effects of item discriminating power, guessing, and variable vs.
fixed test length. Study 4 investigated some properties of the Bayesian
test scores as latent trait estimators, under three different configurations
(regressions of item discrimination on item difficulty) of item pools.
The properties of interest included the regression of latent trait estimates
on actual trait levels, the conditional bias of such estimates, the
information curve of the trait estimates, and the relationship of test
length to ability level. The results of these studies indicated that
the ability estimates derived from the Bayesian testing strategy were
highly correlated with ability level. However, the ability estimates
were also highly correlated with number of items administered, were
non-linearly biased, and provided measurements which were not of equal
precision at all levels of ability.



Research Report 76-2 
Effects of Time-Limits on Test-Taking Behavior
T.W. Miller and David J. Weiss 
April 1976 

Three related experimental studies analyzed rate and accuracy of test 
response under time-limit and no-time-limit conditions. Test instructions 
and multiple-choice vocabulary items were administered by computer. 
Student volunteers received monetary rewards under both testing conditions. 
In the first study college students were blocked into high- and low- 
ability groups on the basis of pretest scores. Results for both ability
groups showed higher response rates under time-limit conditions than
under no-time-limit conditions. There were no significant differences
between time-limit and no-time-limit accuracy scores. Similar results
were obtained in a second study in which each student received both
time-limit and no-time-limit conditions. In a third study each testee
received the same testing condition twice and higher response rates were
observed under the time-limit condition; response accuracy remained
consistent across testing conditions. All three studies showed essentially
zero correlations between response rate and response accuracy. Response
latency data were also analyzed in the three studies. These data suggested
the existence of different test-taking styles and strategies under
time-limit and no-time-limit testing conditions. The results of these studies
suggest that number-correct scores from time-limit tests are a complex
function of response rate, response accuracy, test-taking style and
test-taking strategy, and therefore are not likely to be as valid or
useful as number-correct scores from no-time-limit tests. 



Research Report 76-3
Effects of Immediate Knowledge of Results and Adaptive Testing on
Ability Test Performance
Nancy E. Betz and David J. Weiss
May 1976

This study investigated the effects of immediate knowledge of results
(KR) concerning the correctness or incorrectness of each item response
on a computer-administered test of verbal ability. The effects of KR
were examined on a 50-item conventional test and a stradaptive ability
test and in high- and low-ability groups. The dependent variable was
maximum likelihood ability estimates derived from the item responses.
Results indicated that mean test scores for the high-ability group receiving
KR were higher than for the no-KR group on both the conventional and
stradaptive tests. For low-ability examinees, mean scores were higher
under KR conditions than under no-KR conditions on both tests, but the
difference was statistically significant only for the conventional test.
However, the higher mean scores of the low-ability testees on the stradaptive
test indicated that for low-ability examinees, adaptive testing had the
same incentive effects as did the provision of immediate KR. Knowledge
of results did not have significant effects on either response consistency
on the stradaptive test or response latencies, and neither the shapes
of the resulting test score distributions nor the internal consistency
reliability of the conventional test differed consistently as a function
of KR conditions. No significant score differences were found on a
44-item post-test administered without KR, indicating that the facilitative
effects of knowledge of results on test performance were confined to the
test in which KR was provided. The results of the study were interpreted
as indicating the potential of both immediate knowledge of results and
adaptive testing procedures to increase the extent to which ability
tests measure the "maximum performance" capabilities of each individual.



Research Report 76-4
Psychological Effects of Immediate Knowledge of
Results and Adaptive Ability Testing
Nancy E. Betz and David J. Weiss
May 1976

This study investigated the effects of providing immediate knowledge
of results (KR) and adaptive testing on test anxiety and test-taking
motivation. Also studied was the accuracy of student perceptions of their
test performance on adaptive and conventional tests administered with or
without immediate knowledge of results. Testees were 350 college students
divided into high- and low-ability groups and randomly assigned to one of
four test strategies by KR conditions. The ability level of examinees was
found to be related to their reported levels of motivation and to
differences in reported motivation under the different testing conditions.
Low-ability examinees reported significantly higher levels of motivation
on the stradaptive test than on the conventional test, while the reported
motivation of high-ability examinees did not differ as a function of testing
strategies. The effect of knowledge of results on reported motivation also
differed as a function of ability level. Low-ability testees reported
lower motivation under KR conditions than under no-KR, while higher ability
testees reported higher motivation with KR. Analysis of the anxiety
data indicated that students reported significantly higher levels of
anxiety on the stradaptive test than on the conventional test. The
provision of KR did not result in significant differences in reported
anxiety. However, highest levels of anxiety were reported by the
low-ability group on the stradaptive test administered with KR. These results,
in conjunction with previously reported data on effects of KR on ability
test performance, were interpreted as being the result of facilitative
anxiety. Over 90% of the students reacted favorably to the provision
of immediate knowledge of results. Students were able to perceive their
levels of test performance with some accuracy. However, perceptions
of the relative degree of test difficulty were much more closely related
to actual test score on the conventional test than on the stradaptive
test. Thus, it appears that adaptive testing creates a psychological
environment for testing which is more equivalently reinforcing or encouraging
for examinees of all ability levels.




DISTRIBUTION LIST

4 Dr. Marshall J. Farr, Director
  Personnel and Training Research Programs
  Office of Naval Research (Code 458)
  Arlington, VA 22217

1 ONR Branch Office
  495 Summer Street
  Boston, MA 02210
  ATTN: Dr. James Lester

1 ONR Branch Office
  1030 East Green Street
  Pasadena, CA 91101
  ATTN: Dr. Eugene Gloye

1 ONR Branch Office
  536 South Clark Street
  Chicago, IL 60605
  ATTN: Dr. Charles E. Davis

1 Dr. M. A. Bertin, Scientific Director
  Office of Naval Research
  Scientific Liaison Group/Tokyo
  American Embassy
  APO San Francisco 96503



1 Commanding Officer
Naval Health Research Center
San Diego, CA 92152
ATTN: Library

1 Chairman
Behavioral Science Department
Naval Command & Management Division
U.S. Naval Academy
Annapolis, MD 21402

1 Chief of Naval Education & Training
Naval Air Station
Pensacola, FL 32508
ATTN: CAPT Bruce Stone, USN

1 Mr. Arnold I. Rubinstein
Human Resources Program Manager
Naval Material Command (0344)
Room 1044, Crystal Plaza #5
Washington, DC 20360

1 Dr. Jack Borsting
U.S. Naval Postgraduate School
Department of Operations Research
Monterey, CA 93940



1 Dr. H. Wallace Sinaiko
c/o Office of Naval Research
Code 450
Arlington, VA 22217

6 Director
Naval Research Laboratory
Code 2627
Washington, DC 20390

1 Technical Director
Navy Personnel Research
and Development Center
San Diego, CA 92152

1 Assistant for Research Liaison
Bureau of Naval Personnel (Pers Or)
Room 1416, Arlington Annex
Washington, DC 20370

1 Assistant Deputy Chief of Naval
Personnel for Retention Analysis
and Coordination (Pers 12)
Room 2403, Arlington Annex
Washington, DC 20370

1 LCDR Charles J. Theisen, Jr., MSC, USN
4024
Naval Air Development Center
Warminster, PA 18974



Dr. Lee Miller
Naval Air Systems Command
AIR-413E
Washington, DC 20361

CDR Paul D. Nelson, MSC, USN
Naval Medical R&D Command (Code 44)
National Naval Medical Center
Bethesda, MD 20014



1 Director, Navy Occupational Task
Analysis Program (NOTAP)
Navy Personnel Program Support
Activity
Building 1304, Bolling AFB
Washington, DC 20336

1 Office of Civilian Manpower Management
Code 64
Washington, DC 20390
ATTN: Dr. Richard J. Niehaus

1 Office of Civilian Manpower Management
Code 263
Washington, DC 20390

1 Chief of Naval Reserve
Code 3955
New Orleans, LA 70146

1 Assistant to the Assistant Deputy Chief
of Naval Operations (Manpower)
Head, NAMPS Project Office
Room 1606, Arlington Annex
Washington, DC 20370
ATTN: Dr. Harry M. West

1 Superintendent
Naval Postgraduate School
Monterey, CA 93940
ATTN: Library (Code 2124)

1 Commander, Navy Recruiting Command
4015 Wilson Boulevard
Arlington, VA 22203
ATTN: Code 015

1 Mr. George N. Graine
Naval Sea Systems Command
SEA 047C12
Washington, DC 20362






1 Chief of Naval Technical Training
Naval Air Station Memphis (75)
Millington, TN 38054
Dr. Norman J. Kerr

1 Principal Civilian Advisor
for Education and Training
Naval Training Command, Code 00A
Pensacola, FL 32508
ATTN: Dr. William L. Maloy

1 Director
Training Analysis & Evaluation Group
Code N-00t
Department of the Navy
Orlando, FL 32813
ATTN: Dr. Alfred F. Smode

1 Chief of Naval Training Support
Code N-21
Building 45
Naval Air Station
Pensacola, FL 32508

1 Navy Personnel Research
and Development Center
Code 9041
San Diego, CA 92152
ATTN: Dr. J. D. Fletcher

1 D. M. Gragg, CAPT, MC, USN
Head, Educational Programs Development
Department
Naval Health Sciences Education and
Training Command
Bethesda, MD 20014

1 Naval Undersea Center
Code 6032
San Diego, CA 92132
ATTN: W. Gary Thomson

1 LCDR C. F. Logan, USN
F-14 Management System
COMFITAEWWINGPAC
NAS Miramar, CA 92145

1 Navy Personnel Research
and Development Center
Code 01
San Diego, CA 92152

5 Navy Personnel Research
and Development Center
Code 02
San Diego, CA 92152
ATTN: A. A. Sjoholm

2 Navy Personnel Research
and Development Center
Code 304
San Diego, CA 92152
ATTN: Dr. John Ford

2 Navy Personnel Research
and Development Center
Code 310
San Diego, CA 92152
ATTN: Dr. Martin F. Wiskoff

2 Navy Personnel Research
and Development Center
Code 309
San Diego, CA 92152
ATTN: Dr. C. Cory

1 Navy Personnel Research
and Development Center
San Diego, CA 92152
ATTN: Library



1 Technical Director
U.S. Army Research Institute for the
Behavioral and Social Sciences
1300 Wilson Boulevard
Arlington, VA 22209

1 Headquarters
U.S. Army Administration Center
Personnel Administration Combat
Development Activity
ATCP-HRO
Ft. Benjamin Harrison, IN 46249

1 Armed Forces Staff College
Norfolk, VA 23511
ATTN: Library

1 Commandant
U.S. Army Infantry School
Fort Benning, GA 31905
ATTN: ATSH-DET

1 Deputy Commander
U.S. Army Institute of Administration
Fort Benjamin Harrison, IN 46216
ATTN: EA

1 Dr. Stanley Cohen
U.S. Army Research Institute for the
Behavioral and Social Sciences
1300 Wilson Boulevard
Arlington, VA 22209

1 Dr. Ralph Dusek
U.S. Army Research Institute for the
Behavioral and Social Sciences
1300 Wilson Boulevard
Arlington, VA 22209

1 Dr. Leon H. Nawrocki
U.S. Army Research Institute for the
Behavioral and Social Sciences
1300 Wilson Boulevard
Arlington, VA 22209

1 Dr. Joseph Ward
U.S. Army Research Institute for the
Behavioral and Social Sciences
1300 Wilson Boulevard
Arlington, VA 22209



1 HQ USAREUR & 7th Army
ODCSOPS
USAREUR Director of GED
APO New York 09403

1 ARI Field Unit - Leavenworth
Post Office Box 3122
Fort Leavenworth, KS 66027

Mr. James Baker
U.S. Army Research Institute for the
Behavioral and Social Sciences
1300 Wilson Boulevard
Arlington, VA 22209

Dr. James L. Raney
U.S. Army Research Institute for the
Behavioral and Social Sciences
1300 Wilson Boulevard
Arlington, VA 22209

1 Dr. Milton S. Katz, Chief
Individual Training & Performance
Evaluation
U.S. Army Research Institute for the
Behavioral and Social Sciences
1300 Wilson Boulevard
Arlington, VA 22209

Air Force 

1 Research Branch
AF/DPMYAR
Randolph AFB, TX 78148

1 Dr. G. A. Eckstrand (AFHRL/AST)
Wright-Patterson AFB
Ohio 45433

1 AFHRL/DOJN
Stop #63
Lackland AFB, TX 78236

1 Dr. Martin Rockway (AFHRL/TT)
Lowry AFB
Colorado 80230

1 Dr. Alfred R. Fregly
AFOSR/NL
1400 Wilson Boulevard
Arlington, VA 22209



1 AFHRL/PED
Stop #63
Lackland AFB, TX 78236

1 Major Wayne S. Sellman
Chief of Personnel Testing
HQ USAF/DPMYP
Randolph AFB, TX 78148






Marine Corps



1 Director, Office of Manpower
Utilization
Headquarters, Marine Corps (Code MPU)
MCB (Building 2009)

1 Dr. A. L. Slafkosky
Headquarters, U.S. Marine Corps
Washington, DC

1 Chief, Academic Department
Education Center
Marine Corps Development and
Education Command
Marine Corps Base
Quantico, VA 22134

1 Mr. E. A. Dover
2711 South Veitch Street
Arlington, VA 22206

1 Mr. Joseph J. Cowan, Chief
U.S. Coast Guard Headquarters
Washington, DC 20593


Other DOD

1 Military Assistant for Human Resources
Office of the Secretary of Defense
Room 3D129, Pentagon
Washington, DC 20301

1 Dr. Harold F. O'Neil, Jr.
Advanced Research Projects Agency
Human Resources Research Office
1400 Wilson Boulevard
Arlington, VA 22209

1 Dr. Robert Young
Advanced Research Projects Agency
Human Resources Research Office
1400 Wilson Boulevard
Arlington, VA 22209

1 Mr. Frederick W. Suffa
Chief, Recruiting and Retention Evaluation
Office of the Assistant Secretary of
Defense, M&RA
Room 3D970, Pentagon
Washington, DC 20301



Other Government

1 Dr. Lorraine D. Eyde
Personnel Research and Development
Center
U.S. Civil Service Commission
1900 E Street, N.W.
Washington, DC 20415

1 Dr. William Gorham, Director
Personnel Research and Development
Center
U.S. Civil Service Commission
1900 E Street, N.W.
Washington, DC 20415

1 Dr. Vern Urry
Personnel Research and Development
Center
U.S. Civil Service Commission
1900 E Street, N.W.
Washington, DC 20415

1 Dr. Harold T. Yahr
Personnel Research and Development
Center
U.S. Civil Service Commission
1900 E Street, N.W.
Washington, DC 20415

Dr. Richard C. Atkinson
Deputy Director
National Science Foundation
1800 G Street, N.W.
Washington, DC 20550

Dr. Andrew R. Molnar
Technological Innovations in
Education Group
National Science Foundation
1800 G Street, N.W.
Washington, DC 20550



12 Defense Documentation Center
Cameron Station, Building 5
Alexandria, VA 22314
ATTN: TC



U.S. Civil Service Commission
Federal Office Building
Chicago Regional Staff Division
Regional Psychologist
230 South Dearborn Street
Chicago, IL 60604
ATTN: C. S. Winiewicz

Dr. Carl Frederiksen
Learning Division, Basic Skills Group
National Institute of Education
1200 19th Street, N.W.
Washington, DC 20208






Miscellaneous 

1 Dr. Scarvia B. Anderson
Educational Testing Service
17 Executive Park Drive, N.E.
Atlanta, GA 30329






1 Dr. John Annett
Department of Psychology
The University of Warwick
Coventry CV4 7AL

1 Dr. Gerald V. Barrett
University of Akron
Department of Psychology
Akron, OH 44325

1 Dr. Bernard M. Bass
University of Rochester
Graduate School of Management
Rochester, NY 14627

1 Century Research Corporation
4113 Lee Highway
Arlington, VA 22207

1 Dr. Kenneth E. Clark
University of Rochester
College of Arts and Sciences
River Campus Station
Rochester, NY 14627

Dr. Norman Cliff
University of Southern California
Department of Psychology
University Park
Los Angeles, CA 90007

Dr. Allan M. Collins
Bolt Beranek and Newman, Inc.
50 Moulton Street
Cambridge, MA 02138

Dr. Rene V. Dawis
University of Minnesota
Department of Psychology
Minneapolis, MN 55455

Dr. Ruth Day
Yale University
Department of Psychology
2 Hillhouse Avenue
New Haven, CT 06520

Dr. Norman R. Dixon
200 South Craig Street
University of Pittsburgh
Pittsburgh, PA 15260

Dr. Marvin D. Dunnette
University of Minnesota
Department of Psychology
Minneapolis, MN 55455

ERIC
Processing and Reference Facility
4833 Rugby Avenue
Bethesda, MD 20014






1 Dr. Victor Fields
Montgomery College
Department of Psychology

University of California
Graduate School of Administration
Irvine, CA 92664

1 Dr. Robert Glaser, Co-Director
University of Pittsburgh
Pittsburgh, PA 15213

1 Mr. Harry H. Harman
Educational Testing Service
Princeton, NJ 08540

1 Dr. Richard S. Hatch
Decision Systems Associates, Inc.

1 Dr. M. D. Havron
Human Sciences Research, Inc.
7710 Old Spring House Road
West Gate Industrial Park
McLean, VA 22101

1 HumRRO Central Division
400 Plaza Building
Pace Boulevard at Fairfield Drive
Pensacola, FL 32505

1 HumRRO/Western Division
27857 Berwick Drive
Carmel, CA 93921
ATTN: Library

1 HumRRO Central Division/Columbus Office
Suite 23, 2601 Cross Country Drive
Columbus, GA 31906

1 HumRRO
Joseph A. Austin Building
1939 Goldsmith Lane
Louisville, KY 40218



1 Dr. Frederic M. Lord
Educational Testing Service
Princeton, NJ 08540

1 Dr. Robert R. Mackie
Human Factors Research, Inc.
6780 Cortona Drive
Santa Barbara Research Park
Goleta, CA 93017

1 Dr. William C. Mann
University of Southern California
Information Sciences Institute
4676 Admiralty Way
Marina Del Rey, CA 90291

1 Mr. Edmond Marks
315 Old Main
Pennsylvania State University
University Park, PA 16802

1 Dr. Leo Munday, Vice President
American College Testing Program
P.O. Box 168
Iowa City, IA 52240

1 Dr. Donald A. Norman
University of California, San Diego
Department of Psychology
La Jolla, CA 92037

1 Mr. A. J. Pesch, President
Eclectech Associates, Inc.
P.O. Box 178
North Stonington, CT 06359



Dr. Lawrence B. Johnson
Lawrence Johnson & Associates, Inc.
2001 S Street, N.W., Suite 502
Washington, DC 20009



203 Dodd Hall
Florida State University
Tallahassee, FL 32306



Dr. David Klahr
Carnegie-Mellon University
Department of Psychology
Pittsburgh, PA 15213



1 Dr. Diane M. Ramsey-Klee
R-K Research & System Design
3947 Ridgemont Drive
Malibu, CA 90265

Dr. Joseph W. Rigney
University of Southern California
Behavioral Technology Laboratories
3717 South Grand
Los Angeles, CA 90007

1 Dr. George E. Rowland
Rowland and Company, Inc.
P.O. Box 61
Haddonfield, NJ 08033

1 Dr. Benjamin Schneider
University of Maryland
Department of Psychology
College Park, MD 20742

1 Dr. Arthur I. Siegel
Applied Psychological Services
404 East Lancaster Avenue
Wayne, PA 19087

1 Dr. Henry P. Sims, Jr.
Room 630 - Business
Indiana University
Bloomington, IN 47401



1 Dr. Richard Snow
Stanford University
School of Education
Stanford, CA 94305

3 Mr. George Wheaton
American Institutes for Research
3301 New Mexico Avenue, N.W.
Washington, DC 20016

1 Dr. K. Wescourt
Stanford University
Institute for Mathematical Studies
in the Social Sciences
Stanford, CA 94305

College of Business Administration
Lincoln, NE

1 Dr. John J. Collins
Vice President
Essex Corporation
6305 Caminito Estrellado
San Diego, CA 92120

1 Dr. Lyle Schoenfeldt
Department of Psychology
University of Georgia
Athens, Georgia 30602

1 Dr. Patrick Suppes, Director
Institute for Mathematical Studies
in the Social Sciences
Stanford University
Stanford, CA 94305






